AMIDASE

This invention relates to newly identified
polynucleotides, polypeptides encoded by such
polynucleotides, the use of such polynucleotides and
polypeptides, as well as the production and isolation of
such polynucleotides and polypeptides. More
particularly, the polypeptide of the present invention
has been identified as an amidase and in particular an
enzyme having activity in the removal of arginine,
phenylalanine or methionine from the N-terminal end of
peptides in peptide or peptidomimetic synthesis.

Thermophilic bacteria have received considerable
attention as sources of highly active and thermostable
enzymes (Bronnenmeier, K. and Staudenbauer, W.L., D.R.
Woods (Ed.), The Clostridia and Biotechnology,
Butterworth Publishers, Stoneham, MA (1993)). Recently,
the most extremely thermophilic organotrophic eubacteria
presently known have been isolated and characterized.
These bacteria, which belong to the genus Thermotoga, are
fermentative microorganisms metabolizing a variety of
carbohydrates (Huber, R. and Stetter, K.O., in Balows,
et al., (Ed.), The Procaryotes, 2nd Ed., Springer-Verlag,
New York, pgs. 3809-3819 (1992)).

Because to date most organisms identified from the
archaeal domain are thermophiles or hyperthermophiles,
archaeal bacteria are also considered a fertile source of
thermophilic enzymes.

SUMMARY OF THE INVENTION



In accordance with one aspect of the present
invention, there is provided a novel enzyme, as well as
active fragments, analogs and derivatives thereof.



In accordance with another aspect of the present
invention, there are provided isolated nucleic acid
molecules encoding an enzyme of the present invention
including mRNAs, DNAs, cDNAs, genomic DNAs as well as
active analogs and fragments of such enzymes.

In accordance with yet a further aspect of the
present invention, there is provided a process for
producing such polypeptide by recombinant techniques
comprising culturing recombinant prokaryotic and/or
eukaryotic host cells, containing a nucleic acid sequence
encoding an enzyme of the present invention, under
conditions promoting expression of said enzyme and
subsequent recovery of said enzyme.

In accordance with yet a further aspect of the
present invention, there is provided a process for
utilizing such enzyme, or polynucleotide encoding such
enzyme. The enzyme is useful for the removal of
arginine, phenylalanine, or methionine amino acids from
the N-terminal end of peptides in peptide or
peptidomimetic synthesis. The enzyme is selective for
the L, or "natural" enantiomer of the amino acid
derivatives and is therefore useful for the production of
optically active compounds. These reactions can be
performed in the presence of the chemically more reactive
ester functionality, a step which is very difficult to
achieve with nonenzymatic methods. The enzyme is also
able to tolerate high temperatures (at least 70°C), and
high concentrations of organic solvents (>40% DMSO), both
of which cause a disruption of secondary structure in
peptides; this enables cleavage of otherwise resistant
bonds.

In accordance with yet a further aspect of the 
present invention, there is also provided nucleic acid 
probes comprising nucleic acid molecules of sufficient 
length to specifically hybridize to a nucleic acid
sequence of the present invention. 

In accordance with yet a further aspect of the 
present invention, there is provided a process for 
utilizing such enzymes, or polynucleotides encoding such 
enzymes, for in vitro purposes related to scientific
research, for example, to generate probes for identifying
similar sequences which might encode similar enzymes from
other organisms. 



These and other aspects of the present invention 
should be apparent to those skilled in the art from the
teachings herein.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings are illustrative of 
embodiments of the invention and are not meant to limit
the scope of the invention as encompassed by the claims.

Figure 1 is an illustration of the full-length DNA
and corresponding deduced amino acid sequence of the
enzyme of the present invention. Sequencing was
performed using a 378 automated DNA sequencer (Applied
Biosystems, Inc.).



Figure 2 shows the fluorescence versus 
concentration of DMSO. The filled and open boxes 
represent individual assays from Example 3. 

Figure 3 shows the relative initial linear rates 
(increase in fluorescence per min, i.e. "activity")
versus concentration of DMF for the more reactive
CBZ-L-arg-AMC, from Example 3.

DETAILED DESCRIPTION OF THE INVENTION

The term "gene" means the segment of DNA involved 
in producing a polypeptide chain; it includes regions 
preceding and following the coding region (leader and 
trailer) as well as intervening sequences (introns)
between individual coding segments (exons).

A coding sequence is "operably linked to" another
coding sequence when RNA polymerase will transcribe the
two coding sequences into a single mRNA, which is then
translated into a single polypeptide having amino acids
derived from both coding sequences. The coding sequences 
need not be contiguous to one another so long as the 
expressed sequences are ultimately processed to produce 
the desired protein. 

"Recombinant" enzymes refer to enzymes produced by
recombinant DNA techniques; i.e., produced from cells
transformed by an exogenous DNA construct encoding the
desired enzyme. "Synthetic" enzymes are those prepared
by chemical synthesis.

The present invention provides substantially pure
amidase enzymes. The term "substantially pure" is used
herein to describe a molecule, such as a polypeptide
(e.g., an amidase polypeptide, or a fragment thereof)
that is substantially free of other proteins, lipids,
carbohydrates, nucleic acids, and other biological
materials with which it is naturally associated. For
example, a substantially pure molecule, such as a
polypeptide, can be at least 60%, by dry weight, the
molecule of interest. The purity of the polypeptides can



wo 97/48794 



PCT/US97/09319 




be determined using standard methods including, e.g.,
polyacrylamide gel electrophoresis (e.g., SDS-PAGE),
column chromatography (e.g., high performance liquid
chromatography (HPLC)), and amino-terminal amino acid
sequence analysis.

A DNA "coding sequence of" or a "nucleotide
sequence encoding" a particular enzyme, is a DNA sequence
which is transcribed and translated into an enzyme when
placed under the control of appropriate regulatory
sequences. A "promoter sequence" is a DNA regulatory
region capable of binding RNA polymerase in a cell and
initiating transcription of a downstream (3' direction)
coding sequence. The promoter is part of the DNA
sequence. This sequence region has a start codon at its
3' terminus. The promoter sequence does include the
minimum number of bases or elements necessary to
initiate transcription at levels detectable above
background. However, after the RNA polymerase binds the
sequence and transcription is initiated at the start
codon (3' terminus with a promoter), transcription
proceeds downstream in the 3' direction. Within the
promoter sequence will be found a transcription
initiation site (conveniently defined by mapping with
nuclease S1) as well as protein binding domains
(consensus sequences) responsible for the binding of RNA
polymerase.

The present invention provides a purified
thermostable enzyme that catalyzes the removal of
arginine, phenylalanine, or methionine amino acids from
the N-terminal end of peptides in peptide or
peptidomimetic synthesis. The purified enzyme is an
amidase derived from an organism referred to herein as
"Thermococcus GU5L5" which is a thermophilic archaeal
organism which has a very high temperature optimum. The
organism is strictly anaerobic and grows between 55 and
90°C (optimally at 85°C). GU5L5 was discovered in a
shallow marine hydrothermal area in Vulcano, Italy. The
organism has coccoid cells occurring in singlets or
pairs. GU5L5 grows optimally at 85°C and pH 6.0 in a
marine medium with peptone as a substrate and nitrogen in
gas phase.

The polynucleotide of this invention was 
originally recovered from a genomic gene library derived 
from Thermococcus GU5L5 as described below. It contains 
an open reading frame encoding a protein of 622 amino 
acid residues.

In a preferred embodiment, the amidase enzyme of 
the present invention has a molecular weight of about 
68.5 kilodaltons as inferred from the nucleotide sequence 
of the gene. 
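
As a rough cross-check on the figure inferred from the nucleotide sequence, the mass of a 622-residue protein can be approximated from an average residue mass. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; that constant and the function name are illustrative assumptions and are not taken from this disclosure. The exact value depends on the actual composition of SEQ ID NO:2.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# Assumption for illustration only; not a value stated in the specification.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int,
                      avg_residue_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kilodaltons from its residue count."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    # 622 residues is the open reading frame length given in the text.
    print(f"{estimate_mass_kda(622):.1f} kDa")
```

Under that assumption the estimate lands close to the approximately 68.5 kilodaltons stated above, which is the agreement one would expect for a protein of ordinary composition.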

In accordance with an aspect of the present
invention, there are provided isolated nucleic acid
molecules (polynucleotides) which encode for the mature
enzyme having the deduced amino acid sequence of Figure 1
(SEQ ID NO:2).

This invention, in addition to the isolated
nucleic acid molecule encoding an amidase enzyme
disclosed in Figure 1 (SEQ ID NO:1), also provides
substantially similar sequences. Isolated nucleic acid
sequences are substantially similar if: (i) they are
capable of hybridizing under stringent conditions,
hereinafter described, to SEQ ID NO:1; or (ii) they
encode DNA sequences which are degenerate to SEQ ID NO:1.
Degenerate DNA sequences encode the amino acid sequence
of SEQ ID NO:2, but have variations in the nucleotide
coding sequences. As used herein, "substantially
similar" refers to the sequences having similar identity
to the sequences of the instant invention. The
nucleotide sequences that are substantially similar can
be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially similar can be
identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.

One means for isolating a nucleic acid molecule
encoding an amidase enzyme is to probe a gene library
with a natural or artificially designed probe using art
recognized procedures (see, for example: Current
Protocols in Molecular Biology, Ausubel F.M. et al.
(EDS.) Green Publishing Company Assoc. and John Wiley
Interscience, New York, 1989, 1992). It is appreciated
by one skilled in the art that SEQ ID NO:1, or fragments
thereof (comprising at least 15 contiguous nucleotides),
is a particularly useful probe. Other particularly useful
probes for this purpose are hybridizable fragments to the
sequences of SEQ ID NO:1 (i.e., comprising at least 15
contiguous nucleotides).

With respect to nucleic acid sequences which
hybridize to specific nucleic acid sequences disclosed
herein, hybridization may be carried out under conditions
of reduced stringency, medium stringency or even
stringent conditions. As an example of oligonucleotide
hybridization, a polymer membrane containing immobilized
denatured nucleic acid is first prehybridized for 30
minutes at 45°C in a solution consisting of 0.9 M NaCl,
50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X
Denhardt's, and 0.5 mg/ml polyriboadenylic acid.
Approximately 2 X 10^7 cpm (specific activity 4-9 X 10^8
cpm/ug) of 32P end-labeled oligonucleotide probe are then
added to the solution. After 12-16 hours of incubation,
the membrane is washed for 30 minutes at room temperature
in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8,
1 mM Na2EDTA) containing 0.5% SDS, followed by a 30 minute
wash in fresh 1X SET at Tm-10°C for the oligonucleotide
probe. The membrane is then exposed to auto-radiographic
film for detection of hybridization signals.
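
The wash at Tm-10°C requires an estimate of the probe's melting temperature, which the text does not specify how to obtain. The sketch below uses the Wallace rule (2°C per A or T, 4°C per G or C), a widely used approximation for short oligonucleotides in high-salt buffer. The choice of that rule, the function names, and the example probe are assumptions made for illustration and are not part of this disclosure.

```python
def wallace_tm_celsius(oligo: str) -> int:
    """Estimate Tm of a short DNA oligonucleotide with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C), in degrees Celsius.
    """
    seq = oligo.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def stringent_wash_celsius(oligo: str, offset: int = 10) -> int:
    """Temperature of the second wash, taken as Tm minus the given offset."""
    return wallace_tm_celsius(oligo) - offset


if __name__ == "__main__":
    # Hypothetical 18-mer used only as an example; it is not from SEQ ID NO:1.
    probe = "ATGGCTAAGCTTGACGTC"
    print(wallace_tm_celsius(probe), stringent_wash_celsius(probe))
```

For longer probes or different salt conditions, a nearest-neighbor calculation would normally be preferred over this simple count.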

Stringent conditions means hybridization will
occur only if there is at least 90% identity, preferably
at least 95% identity and most preferably at least 97%
identity between the sequences. See J. Sambrook et al.,
Molecular Cloning, A Laboratory Manual (2d Ed. 1989)
(Cold Spring Harbor Laboratory) which is hereby
incorporated by reference in its entirety.

"Identity" as the term is used herein, refers to a 
polynucleotide sequence which comprises a percentage of 
the same bases as a reference polynucleotide (SEQ ID
NO:1). For example, a polynucleotide which is at least
90% identical to a reference polynucleotide, has
polynucleotide bases which are identical in 90% of the
bases which make up the reference polynucleotide and may
have different bases in 10% of the bases which comprise
that polynucleotide sequence.
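
The definition of identity above can be illustrated with a short calculation. The sketch below assumes the two sequences have already been aligned and have equal length with no gaps; real comparisons would first run an alignment step. The function name and the ten-base example sequences are hypothetical.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percentage of reference positions carrying the same base in candidate.

    Both sequences must be pre-aligned, gap-free, and of equal length.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == c for r, c in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    ref = "ACGTACGTAC"   # hypothetical reference fragment
    var = "ACGTACGTAA"   # differs from ref at the last base only
    print(percent_identity(ref, var))
```

Here nine of the ten bases match, so the candidate is 90% identical to the reference and differs in the remaining 10%, mirroring the example in the text.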

The present invention also relates to
polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes,
for example the changes do not alter the amino acid
sequence encoded by the polynucleotide. The present
invention also relates to nucleotide changes which result
in amino acid substitutions, additions, deletions,
fusions and truncations in the enzyme encoded by the
reference polynucleotide (SEQ ID NO:1). In a preferred
aspect of the invention these enzymes retain the same
biological action as the enzyme encoded by the reference
polynucleotide.
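
A silent change can be checked by translating both sequences and comparing the resulting amino acids. The sketch below is a minimal illustration that assumes the standard genetic code and carries only a partial codon table (Arg, Phe, Met and Leu codons); the table subset, function names and example sequences are assumptions chosen for brevity and are not drawn from SEQ ID NO:1.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "AGA": "Arg", "AGG": "Arg",
    "TTT": "Phe", "TTC": "Phe",
    "ATG": "Met",
    "TTA": "Leu", "TTG": "Leu", "CTT": "Leu", "CTC": "Leu",
    "CTA": "Leu", "CTG": "Leu",
}


def translate(dna: str) -> list:
    """Translate an in-frame DNA string into a list of amino acid names."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    try:
        return [CODON_TABLE[codon] for codon in codons]
    except KeyError as exc:
        raise ValueError(f"codon not in the partial table: {exc}") from None


def is_silent_change(reference: str, variant: str) -> bool:
    """True if variant differs in bases but encodes the same amino acids."""
    return (reference.upper() != variant.upper()
            and translate(reference) == translate(variant))


if __name__ == "__main__":
    print(is_silent_change("ATGCGTTTT", "ATGAGATTC"))  # Met-Arg-Phe both ways
    print(is_silent_change("ATGCGTTTT", "ATGCTTTTT"))  # Arg replaced by Leu
```

The first pair differs at the nucleotide level yet encodes the same three residues, which is the degenerate case described above; the second pair produces an amino acid substitution instead.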



It is also appreciated that such probes can be and
are preferably labeled with an analytically detectable
reagent to facilitate identification of the probe.
Useful reagents include but are not limited to
radioactivity, fluorescent dyes or enzymes capable of
catalyzing the formation of a detectable product. The
probes are thus useful to isolate complementary copies of
DNA from other animal sources or to screen such sources
for related sequences.



The coding sequence for the amidase enzyme of the
present invention was identified by preparing a
Thermococcus GU5L5 genomic DNA library and screening the
library for the clones having amidase activity. Such
methods for constructing a genomic gene library are
well-known in the art. One means, for example, comprises
shearing DNA isolated from GU5L5 by physical disruption.
A small amount of the sheared DNA is checked on an
agarose gel to verify that the majority of the DNA is in
the desired size range (approximately 3-6 kb). The DNA
is then blunt ended using Mung Bean Nuclease, incubated
at 37°C and phenol/chloroform extracted. The DNA is then
methylated using Eco RI Methylase. Eco RI linkers are
then ligated to the blunt ends through the use of DNA
ligase and incubation at 4°C. The ligation reaction is
then terminated and the DNA is cut-back with Eco RI
restriction enzyme. The DNA is then size fractionated on
a sucrose gradient following procedures known in the art,
for example, Maniatis, T., et al., Molecular Cloning,
Cold Spring Harbor Press, New York, 1982, which is hereby
incorporated by reference in its entirety.

A plate assay is then performed to get an
approximate concentration of the DNA. Ligation reactions
are then performed and 1 ul of the ligation reaction is
packaged to construct a library. Packaging, for example,
may occur through the use of purified λgt11 phage arms
cut with EcoRI and DNA cut with EcoRI after attaching
EcoRI linkers. The DNA and λgt11 arms are ligated with
DNA ligase. The ligated DNA is then packaged into
infectious phage particles. The packaged phages are used
to infect E. coli cultures and the infected cells are
spread on agar plates to yield plates carrying thousands
of individual phage plaques. The library is then
amplified.

Fragments of the full length gene of the present 
invention may be used as a hybridization probe for a cDNA 
or a genomic library to isolate the full length DNA and 
to isolate other DNAs which have a high sequence 
similarity to the gene or similar biological activity. 
Probes of this type have at least 10, preferably at least 
15, and even more preferably at least 30 bases and may
contain, for example, at least 50 or more bases. The
probe may also be used to identify a DNA clone
corresponding to a full length transcript and a genomic
clone or clones that contain the complete gene including
regulatory and promoter regions, exons, and introns.



The isolated nucleic acid sequences and other 
enzymes may then be measured for retention of biological 
activity characteristic to the enzyme of the present 
invention, for example, in an assay for detecting 
enzymatic amidase activity. Such enzymes include
truncated forms of amidase, and variants such as deletion
and insertion variants. 



The polynucleotide of the present invention may be
in the form of DNA which DNA includes cDNA, genomic DNA,
and synthetic DNA. The DNA may be double-stranded or
single-stranded, and if single stranded may be the coding
strand or non-coding (anti-sense) strand. The coding
sequence which encodes the mature enzyme may be identical
to the coding sequence shown in Figure 1 (SEQ ID NO:1)
and/or that of the deposited clone or may be a different
coding sequence which coding sequence, as a result of the
redundancy or degeneracy of the genetic code, encodes the
same mature enzyme as the DNA of Figure 1 (SEQ ID NO:1).
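
The relationship between the coding strand and the non-coding (anti-sense) strand mentioned above is a reverse complement. The sketch below shows that operation on a short, made-up sequence; the function name and the example fragment are illustrative assumptions, and only the unambiguous bases A, C, G and T are handled.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the anti-sense strand, written 5' to 3', for a DNA sequence."""
    if any(base not in "ACGTacgt" for base in seq):
        raise ValueError("sequence may contain only A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding = "ATGCGT"  # hypothetical coding-strand fragment
    antisense = reverse_complement(coding)
    print(antisense)
    # Applying the operation twice recovers the original strand.
    print(reverse_complement(antisense) == coding)
```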

The polynucleotide which encodes for the mature
enzyme of Figure 1 (SEQ ID NO:2) may include, but is not
limited to: only the coding sequence for the mature
enzyme; the coding sequence for the mature enzyme and
additional coding sequence such as a leader sequence or a
proprotein sequence; the coding sequence for the mature
enzyme (and optionally additional coding sequence) and
non-coding sequence, such as introns or non-coding
sequence 5' and/or 3' of the coding sequence for the
mature enzyme.

Thus, the term "polynucleotide encoding an enzyme
(protein)" encompasses a polynucleotide which includes
only coding sequence for the enzyme as well as a
polynucleotide which includes additional coding and/or
non-coding sequence.



The present invention further relates to variants
of the hereinabove described polynucleotides which encode
for fragments, analogs and derivatives of the enzyme
having the deduced amino acid sequence of Figure 1 (SEQ
ID NO:2). The variant of the polynucleotide may be a
naturally occurring allelic variant of the polynucleotide
or a non-naturally occurring variant of the
polynucleotide.



Thus, the present invention includes
polynucleotides encoding the same mature enzyme as shown
in Figure 1 (SEQ ID NO:2) as well as variants of such
polynucleotides which variants encode for a fragment,
derivative or analog of the enzyme of Figure 1 (SEQ ID
NO:2). Such nucleotide variants include deletion
variants, substitution variants and addition or insertion
variants.



As hereinabove indicated, the polynucleotide may
have a coding sequence which is a naturally occurring
allelic variant of the coding sequence shown in Figure 1
(SEQ ID NO:1). As known in the art, an allelic variant
is an alternate form of a polynucleotide sequence which
may have a substitution, deletion or addition of one or
more nucleotides, which does not substantially alter the
function of the encoded enzyme.

The present invention also includes
polynucleotides, wherein the coding sequence for the
mature enzyme may be fused in the same reading frame to a
polynucleotide sequence which aids in expression and
secretion of an enzyme from a host cell, for example, a
leader sequence which functions to control transport of
an enzyme from the cell. The enzyme having a leader
sequence is a preprotein and may have the leader sequence
cleaved by the host cell to form the mature form of the
enzyme. The polynucleotides may also encode for a
proprotein which is the mature protein plus additional 5'
amino acid residues. A mature protein having a
prosequence is a proprotein and is an inactive form of
the protein. Once the prosequence is cleaved an active
mature protein remains.

Thus, for example, the polynucleotide of the
present invention may encode for a mature enzyme, or for
an enzyme having a prosequence or for an enzyme having
both a prosequence and a presequence (leader sequence).

The present invention further relates to
polynucleotides which hybridize to the
hereinabove-described sequences if there is at least 70%,
preferably at least 90%, and more preferably at least 95%
identity between the sequences. The present invention
particularly relates to polynucleotides which hybridize
under stringent conditions to the hereinabove-described
polynucleotides. As herein used, the term "stringent
conditions" means hybridization will occur only if there
is at least 95% and preferably at least 97% identity
between the sequences. The polynucleotides which
hybridize to the hereinabove described polynucleotides in
a preferred embodiment encode enzymes which either retain
substantially the same biological function or activity as
the mature enzyme encoded by the DNA of Figure 1 (SEQ ID
NO:1).

Alternatively, the polynucleotide may have at
least 15 bases, preferably at least 30 bases, and more
preferably at least 50 bases which hybridize to a
polynucleotide of the present invention and which has an
identity thereto, as hereinabove described, and which may
or may not retain activity. For example, such
polynucleotides may be employed as probes for the
polynucleotide of SEQ ID NO:1, for example, for recovery
of the polynucleotide or as a PCR primer.

Thus, the present invention is directed to
polynucleotides having at least a 70% identity,
preferably at least 90% identity and more preferably at
least a 95% identity to a polynucleotide which encodes
the enzyme of SEQ ID NO:2 as well as fragments thereof,
which fragments have at least 30 bases and preferably at
least 50 bases and to enzymes encoded by such
polynucleotides.

The present invention further relates to an enzyme
which has the deduced amino acid sequence of Figure 1
(SEQ ID NO:2), as well as fragments, analogs and
derivatives of such enzyme.

The terms "fragment," "derivative" and "analog"
when referring to the enzyme of Figure 1 (SEQ ID NO:2)
mean an enzyme which retains essentially the same
biological function or activity as such enzyme. Thus, an
analog includes a proprotein which can be activated by
cleavage of the proprotein portion to produce an active
mature enzyme.



The enzyme of the present invention may be a 
recombinant enzyme, a natural enzyme or a synthetic 
enzyme, preferably a recombinant enzyme.



The fragment, derivative or analog of the enzyme
of Figure 1 (SEQ ID NO:2) may be (i) one in which one or
more of the amino acid residues are substituted with a
conserved or non-conserved amino acid residue (preferably
a conserved amino acid residue) and such substituted
amino acid residue may or may not be one encoded by the
genetic code, or (ii) one in which one or more of the
amino acid residues includes a substituent group, or
(iii) one in which the mature enzyme is fused with
another compound, such as a compound to increase the
half-life of the enzyme (for example, polyethylene
glycol), or (iv) one in which the additional amino acids
are fused to the mature enzyme, such as a leader or
secretory sequence or a sequence which is employed for
purification of the mature enzyme or a proprotein
sequence. Such fragments, derivatives and analogs are
deemed to be within the scope of those skilled in the art
from the teachings herein.

The enzymes and polynucleotides of the present
invention are preferably provided in an isolated form,
and preferably are purified to homogeneity.



The term "isolated" means that the material is
removed from its original environment (e.g., the natural
environment if it is naturally occurring). For example,
a naturally-occurring polynucleotide or enzyme present in
a living animal is not isolated, but the same
polynucleotide or enzyme, separated from some or all of
the coexisting materials in the natural system, is
isolated. Such polynucleotides could be part of a vector
and/or such polynucleotides or enzymes could be part of a
composition, and still be isolated in that such vector or
composition is not part of its natural environment.

The enzymes of the present invention include the
enzyme of SEQ ID NO:2 (in particular the mature enzyme)
as well as enzymes which have at least 70% similarity
(preferably at least 70% identity) to the enzyme of SEQ
ID NO:2 and more preferably at least 90% similarity (more
preferably at least 90% identity) to the enzyme of SEQ ID
NO:2 and still more preferably at least 95% similarity
(still more preferably at least 95% identity) to the
enzyme of SEQ ID NO:2 and also include portions of such
enzymes with such portion of the enzyme generally
containing at least 30 amino acids and more preferably at
least 50 amino acids.



As known in the art "similarity" between two
enzymes is determined by comparing the amino acid
sequence and its conserved amino acid substitutes of one
enzyme to the sequence of a second enzyme. Similarity
may be determined by procedures which are well-known in
the art, for example, a BLAST program (Basic Local
Alignment Search Tool at the National Center for
Biotechnology Information).

A variant, i.e. a "fragment", "analog" or
"derivative" enzyme, and reference enzyme may differ in
amino acid sequence by one or more substitutions,
additions, deletions, fusions and truncations, which may
be present in any combination.

Among preferred variants are those that vary from
a reference by conservative amino acid substitutions.
Such substitutions are those that substitute a given
amino acid in a polypeptide by another amino acid of like
characteristics. Typically seen as conservative
substitutions are the replacements, one for another,
among the aliphatic amino acids Ala, Val, Leu and Ile;
interchange of the hydroxyl residues Ser and Thr,
exchange of the acidic residues Asp and Glu, substitution
between the amide residues Asn and Gln, exchange of the
basic residues Lys and Arg and replacements among the
aromatic residues Phe, Tyr.
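
The groupings listed above lend themselves to a simple lookup. The sketch below encodes exactly those six groups and reports whether a given replacement stays within one group; the data structure and function name are illustrative assumptions, and amino acids not named in the paragraph are treated as having no conservative partner.

```python
# Conservative substitution groups as enumerated in the text.
CONSERVATIVE_GROUPS = (
    frozenset({"Ala", "Val", "Leu", "Ile"}),  # aliphatic
    frozenset({"Ser", "Thr"}),                # hydroxyl
    frozenset({"Asp", "Glu"}),                # acidic
    frozenset({"Asn", "Gln"}),                # amide
    frozenset({"Lys", "Arg"}),                # basic
    frozenset({"Phe", "Tyr"}),                # aromatic
)


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing original with replacement stays within one group."""
    if original == replacement:
        return False  # no substitution took place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))  # both aliphatic
    print(is_conservative("Asp", "Lys"))  # acidic to basic
```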

Most highly preferred are variants which retain
the same biological function and activity as the
reference polypeptide from which they vary.

Fragments or portions of the enzymes of the
present invention may be employed for producing the
corresponding full-length enzyme by peptide synthesis;
therefore, the fragments may be employed as intermediates
for producing the full-length enzymes. Fragments or
portions of the polynucleotides of the present invention
may be used to synthesize full-length polynucleotides of
the present invention.

The present invention also relates to vectors
which include polynucleotides of the present invention,
host cells which are genetically engineered with vectors
of the invention and the production of enzymes of the
invention by recombinant techniques.

Host cells are genetically engineered (transduced
or transformed or transfected) with the vectors
containing the polynucleotides of this invention. Such
vectors may be, for example, a cloning vector or an
expression vector. The vector may be, for example, in
the form of a plasmid, a viral particle, a phage, etc.
The engineered host cells can be cultured in conventional
nutrient media modified as appropriate for activating
promoters, selecting transformants or amplifying the
genes of the present invention. The culture conditions,
such as temperature, pH and the like, are those
previously used with the host cell selected for
expression, and will be apparent to the ordinarily
skilled artisan.

The polynucleotides of the present invention may
be employed for producing enzymes by recombinant
techniques. Thus, for example, the polynucleotide may be
included in any one of a variety of expression vectors
for expressing an enzyme. Such vectors include
chromosomal, nonchromosomal and synthetic DNA sequences,
e.g., derivatives of SV40; bacterial plasmids; phage DNA;
baculovirus; yeast plasmids; vectors derived from
combinations of plasmids and phage DNA, viral DNA such as
vaccinia, adenovirus, fowl pox virus, and pseudorabies.
However, any other vector may be used as long as it is
replicable and viable in the host.

The appropriate DNA sequence may be inserted into
the vector by a variety of procedures. In general, the
DNA sequence is inserted into an appropriate restriction
endonuclease site(s) by procedures known in the art.
Such procedures and others are deemed to be within the
scope of those skilled in the art.

The DNA sequence in the expression vector is
operatively linked to an appropriate expression control
sequence(s) (promoter) to direct mRNA synthesis. As
representative examples of such promoters, there may be
mentioned: LTR or SV40 promoter, the E. coli lac or trp,
the phage lambda PL promoter and other promoters known to
control expression of genes in prokaryotic or eukaryotic
cells or their viruses. The expression vector also
contains a ribosome binding site for translation
initiation and a transcription terminator. The vector
may also include appropriate sequences for amplifying
expression.

In addition, the expression vectors preferably
contain one or more selectable marker genes to provide a
phenotypic trait for selection of transformed host cells
such as dihydrofolate reductase or neomycin resistance
for eukaryotic cell culture, or such as tetracycline or
ampicillin resistance in E. coli.



The vector containing the appropriate DNA sequence
as hereinabove described, as well as an appropriate
promoter or control sequence, may be employed to
transform an appropriate host to permit the host to
express the protein.

As representative examples of appropriate hosts,
there may be mentioned: bacterial cells, such as E. coli,
Streptomyces, Bacillus subtilis; fungal cells, such as
yeast; insect cells such as Drosophila S2 and Spodoptera
Sf9; animal cells such as CHO, COS or Bowes melanoma;
adenoviruses; plant cells, etc. The selection of an
appropriate host is deemed to be within the scope of
those skilled in the art from the teachings herein.

More particularly, the present invention also
includes recombinant constructs comprising one or more of
the sequences as broadly described above. The constructs
comprise a vector, such as a plasmid or viral vector,
into which a sequence of the invention has been inserted,
in a forward or reverse orientation. In a preferred
aspect of this embodiment, the construct further
comprises regulatory sequences, including, for example, a
promoter, operably linked to the sequence. Large numbers
of suitable vectors and promoters are known to those of
skill in the art, and are commercially available. The
following vectors are provided by way of example:
Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBluescript II
(Stratagene); pTRC99a, pKK223-3, pDR540, pRIT2T
(Pharmacia); Eukaryotic: pXT1, pSG5 (Stratagene), pSVK3,
pBPV, pMSG, pSVLSV40 (Pharmacia). However, any other
plasmid or vector may be used as long as they are
30 repiicable and viable in the host. 
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Prcr.oter regions can be selected from any desired 
gene using CAT f chlorair.phenicoi transferase: vectors or 
orher vectors with selectable .Tarkers. Two appropriate 
vectors are pKK232-'3- and pCMV . Particular nar.ed 
5 bacterial promoters include lad, lacZ, 73, T7, gpt, 
lambda Pp, and trp. Eukaryotic promoters include CMV 
immediate early, HSV thymidine kinase, early and late 
SV40/ LTRs from retrovirus, and mouse metallothionein-I . 
Selection of the appropriate vector and promoter is well 
10 withm the level of ordinary skill in the art. 

In a further embodiment, the present invention
relates to host cells containing the above-described
constructs. The host cell can be a higher eukaryotic
cell, such as a mammalian cell, or a lower eukaryotic
cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a bacterial cell. Introduction
of the construct into the host cell can be effected by
calcium phosphate transfection, DEAE-Dextran mediated
transfection, or electroporation (Davis, L., Dibner, M.,
Battey, I., Basic Methods in Molecular Biology, (1986)).

The constructs in host cells can be used in a
conventional manner to produce the gene product encoded
by the recombinant sequence. Alternatively, the enzymes
of the invention can be synthetically produced by
conventional peptide synthesizers.

Mature proteins can be expressed in mammalian
cells, yeast, bacteria, or other cells under the control
of appropriate promoters. Cell-free translation systems
can also be employed to produce such proteins using RNAs
derived from the DNA constructs of the present invention.
Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by
Sambrook et al., Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor, N.Y., (1989), the
disclosure of which is hereby incorporated by reference.

Transcription of the DNA encoding the enzymes of
the present invention by higher eukaryotes is increased
by inserting an enhancer sequence into the vector.
Enhancers are cis-acting elements of DNA, usually from
about 10 to 300 bp, that act on a promoter to increase its
transcription. Examples include the SV40 enhancer on the
late side of the replication origin (bp 100 to 270), a
cytomegalovirus early promoter enhancer, the polyoma
enhancer on the late side of the replication origin, and
adenovirus enhancers.

Generally, recombinant expression vectors will
include origins of replication and selectable markers
permitting transformation of the host cell, e.g., the
ampicillin resistance gene of E. coli and S. cerevisiae
TRP1 gene, and a promoter derived from a highly-expressed
gene to direct transcription of a downstream structural
sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate
kinase (PGK), α-factor, acid phosphatase, or heat shock
proteins, among others. The heterologous structural
sequence is assembled in appropriate phase with
translation initiation and termination sequences, and
preferably, a leader sequence capable of directing
secretion of translated enzyme. Optionally, the
heterologous sequence can encode a fusion enzyme
including an N-terminal identification peptide imparting



wo 97/48794 



PCT/IJS97/09319 



- 24 - 

desired characteristics, e.g., stabilization or
simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are
constructed by inserting a structural DNA sequence
encoding a desired protein together with suitable
translation initiation and termination signals in
operable reading phase with a functional promoter. The
vector will comprise one or more phenotypic selectable
markers and an origin of replication to ensure
maintenance of the vector and to, if desirable, provide
amplification within the host. Suitable prokaryotic
hosts for transformation include E. coli, Bacillus
subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and
Staphylococcus, although others may also be employed as a
matter of choice.



As a representative but nonlimiting example,
useful expression vectors for bacterial use can comprise
a selectable marker and bacterial origin of replication
derived from commercially available plasmids comprising
genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for
example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala,
Sweden) and GEM1 (Promega Biotec, Madison, WI, USA).
These pBR322 "backbone" sections are combined with an
appropriate promoter and the structural sequence to be
expressed.



Following transformation of a suitable host strain
and growth of the host strain to an appropriate cell
density, the selected promoter is induced by appropriate
means (e.g., temperature shift or chemical induction) and
cells are cultured for an additional period.

Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the
resulting crude extract retained for further
purification.

Microbial cells employed in expression of proteins
can be disrupted by any convenient method, including
freeze-thaw cycling, sonication, mechanical disruption,
or use of cell lysing agents; such methods are well known
to those skilled in the art.

Various mammalian cell culture systems can also be
employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of
monkey kidney fibroblasts, described by Gluzman, Cell,
23:175 (1981), and other cell lines capable of expressing
a compatible vector, for example, the C127, 3T3, CHO,
HeLa and BHK cell lines. Mammalian expression vectors
will comprise an origin of replication, a suitable
promoter and enhancer, and also any necessary ribosome
binding sites, polyadenylation site, splice donor and
acceptor sites, transcriptional termination sequences,
and 5' flanking nontranscribed sequences. DNA sequences
derived from the SV40 splice and polyadenylation sites
may be used to provide the required nontranscribed
genetic elements.

The enzyme can be recovered and purified from
recombinant cell cultures by methods including ammonium
sulfate or ethanol precipitation, acid extraction, anion
or cation exchange chromatography, phosphocellulose
chromatography, hydrophobic interaction chromatography,
affinity chromatography, hydroxylapatite chromatography
and lectin chromatography. Protein refolding steps can
be used, as necessary, in completing configuration of the
mature protein. Finally, high performance liquid
chromatography (HPLC) can be employed for final
purification steps.

The enzymes of the present invention may be a
naturally purified product, or a product of chemical
synthetic procedures, or produced by recombinant
techniques from a prokaryotic or eukaryotic host (for
example, by bacterial, yeast, higher plant, insect and
mammalian cells in culture). Depending upon the host
employed in a recombinant production procedure, the
enzymes of the present invention may be glycosylated or
may be non-glycosylated. Enzymes of the invention may or
may not also include an initial methionine amino acid
residue.

The enzymes, their fragments or other derivatives,
or analogs thereof, or cells expressing them can be used
as an immunogen to produce antibodies thereto. These
antibodies can be, for example, polyclonal or monoclonal
antibodies. The present invention also includes
chimeric, single chain, and humanized antibodies, as well
as Fab fragments, or the product of an Fab expression
library. Various procedures known in the art may be used
for the production of such antibodies and fragments.

Antibodies generated against the enzymes
corresponding to a sequence of the present invention can
be obtained by direct injection of the enzymes into an
animal or by administering the enzymes to an animal,
preferably a nonhuman. The antibody so obtained will
then bind the enzymes themselves. In this manner, even a
sequence encoding only a fragment of the enzymes can be
used to generate antibodies binding the whole native
enzymes. Such antibodies can then be used to isolate the
enzyme from cells expressing that enzyme.

For preparation of monoclonal antibodies, any
technique which provides antibodies produced by
continuous cell line cultures can be used. Examples
include the hybridoma technique (Kohler and Milstein,
1975, Nature, 256:495-497), the trioma technique, the
human B-cell hybridoma technique (Kozbor et al., 1983,
Immunology Today 4:72), and the EBV-hybridoma technique
to produce human monoclonal antibodies (Cole, et al.,
1985, in Monoclonal Antibodies and Cancer Therapy, Alan
R. Liss, Inc., pp. 77-96).

Techniques described for the production of single 
chain antibodies (U.S. Patent 4,946,778) can be adapted 
to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be 
used to express humanized antibodies to immunogenic 
enzyme products of this invention. 

Antibodies generated against the enzyme of the
present invention may be used in screening for similar
enzymes from other organisms and samples. Such screening
techniques are known in the art; for example, one such
screening assay is described in "Methods for Measuring
Cellulase Activities", Methods in Enzymology, Vol 160,
pp. 87-116, which is hereby incorporated by reference in
its entirety. Antibodies may also be employed as a probe
to screen gene libraries generated from this or other
organisms to identify this or cross reactive activities.

The term "antibody," as used herein, refers to
intact immunoglobulin molecules, as well as fragments of
immunoglobulin molecules, such as Fab, Fab', (Fab')2, Fv,
and SCA fragments, that are capable of binding to an
epitope of an amidase polypeptide. These antibody
fragments, which retain some ability to selectively bind
to the antigen (e.g., an amidase antigen) of the antibody
from which they are derived, can be made using well known
methods in the art (see, e.g., Harlow and Lane, supra),
and are described further, as follows.



(1) A Fab fragment consists of a monovalent antigen-
binding fragment of an antibody molecule, and can be
produced by digestion of a whole antibody molecule with
the enzyme papain, to yield a fragment consisting of an
intact light chain and a portion of a heavy chain.

(2) A Fab' fragment of an antibody molecule can be
obtained by treating a whole antibody molecule with
pepsin, followed by reduction, to yield a molecule
consisting of an intact light chain and a portion of a
heavy chain. Two Fab' fragments are obtained per
antibody molecule treated in this manner.



(3) A (Fab')2 fragment of an antibody can be obtained by
treating a whole antibody molecule with the enzyme
pepsin, without subsequent reduction. A (Fab')2 fragment
is a dimer of two Fab' fragments, held together by two
disulfide bonds.

(4) An Fv fragment is defined as a genetically engineered
fragment containing the variable region of a light chain
and the variable region of a heavy chain expressed as two
chains.

(5) A single chain antibody ("SCA") is a genetically
engineered single chain molecule containing the variable
region of a light chain and the variable region of a
heavy chain, linked by a suitable, flexible polypeptide
linker.

As used in this invention, the term "epitope"
refers to an antigenic determinant on an antigen, such as
an amidase polypeptide, to which the paratope of an
antibody, such as an amidase-specific antibody, binds.
Antigenic determinants usually consist of chemically
active surface groupings of molecules, such as amino
acids or sugar side chains, and can have specific three-
dimensional structural characteristics, as well as
specific charge characteristics.

The present invention is further described with
reference to the following examples; however, it is to be
understood that the present invention is not limited to
such examples. All parts or amounts, unless otherwise
specified, are by weight.

In order to facilitate understanding of the
following examples certain frequently occurring methods
and/or terms will be described.

"Plasmids" are designated by a lower case p
preceded and/or followed by capital letters and/or
numbers. The starting plasmids herein are either
commercially available, publicly available on an
unrestricted basis, or can be constructed from available
plasmids in accord with published procedures. In
addition, equivalent plasmids to those described are
known in the art and will be apparent to the ordinarily
skilled artisan.

"Digestion" of DNA refers to catalytic cleavage of
the DNA with a restriction enzyme that acts only at
certain sequences in the DNA. The various restriction
enzymes used herein are commercially available and their
reaction conditions, cofactors and other requirements
were used as would be known to the ordinarily skilled
artisan. For analytical purposes, typically 1 µg of
plasmid or DNA fragment is used with about 2 units of
enzyme in about 20 µl of buffer solution. For the
purpose of isolating DNA fragments for plasmid
construction, typically 6 to 50 µg of DNA are digested
with 20 to 250 units of enzyme in a larger volume.
Appropriate buffers and substrate amounts for particular
restriction enzymes are specified by the manufacturer.
Incubation times of about 1 hour at 37°C are ordinarily
used, but may vary in accordance with the supplier's
instructions. After digestion the reaction is
electrophoresed directly on a polyacrylamide gel to
isolate the desired fragment.

Size separation of the cleaved fragments is
performed using 5 percent polyacrylamide gel described by
Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980).
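The analytical digest quantities given above (1 µg of DNA, about 2 units of enzyme, about 20 µl of buffer) can be scaled for a different DNA mass. The following is a minimal sketch only, assuming simple linear scaling from the analytical ratio; the function name is hypothetical, and the manufacturer's instructions remain the governing reference for any particular enzyme.

```python
def scale_analytical_digest(dna_ug: float) -> dict:
    """Scale the analytical digest (1 ug DNA : 2 units : 20 ul) linearly.

    Linear scaling is an illustrative assumption, not a stated rule of
    the specification.
    """
    if dna_ug <= 0:
        raise ValueError("DNA mass must be positive")
    units_per_ug = 2.0    # about 2 units of enzyme per 1 ug of DNA
    buffer_ul_per_ug = 20.0  # about 20 ul of buffer per 1 ug of DNA
    return {
        "enzyme_units": units_per_ug * dna_ug,
        "buffer_ul": buffer_ul_per_ug * dna_ug,
    }


if __name__ == "__main__":
    print(scale_analytical_digest(2.5))
```

For preparative digests the text quotes separate ranges (20 to 250 units in a larger volume), so the sketch should be read as covering the analytical case only.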



"Oligonucleotides" refers to either a single
stranded polydeoxynucleotide or two complementary
polydeoxynucleotide strands which may be chemically
synthesized. Such synthetic oligonucleotides may or may
not have a 5' phosphate. Those that do not will not
ligate to another oligonucleotide without adding a
phosphate with an ATP in the presence of a kinase. A
synthetic oligonucleotide will ligate to a fragment that
has not been dephosphorylated.

"Ligation" refers to the process of forming
phosphodiester bonds between two double stranded nucleic
acid fragments (Maniatis et al., Id., p. 146). Unless
otherwise provided, ligation may be accomplished using
known buffers and conditions with 10 units of T4 DNA
ligase ("ligase") per 0.5 µg of approximately equimolar
amounts of the DNA fragments to be ligated.



Unless otherwise stated, transformation was
performed as described in the method of Sambrook, Fritsch
and Maniatis, 1989.



Example 1 

Bacterial Expression and Purification of Amidase 

A Thermococcus GU5L5 genomic library was screened
for amidase activity as described in Example 2 and a
positive clone was identified and isolated. DNA of this
clone was used as a template in a 100 µl PCR reaction
using the following primer sequences:

5' primer: 5' CCGAGAATTC ATTAAAGAGG AGAAATTAAC TATGACCGGC
ATCGAATGGA 3' (SEQ ID NO:3). 3' primer: 5' A.^TAAGGATC
CACACTGGCA CAGTGTCA^^G ACA 3' (SEQ ID NO:4).
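The primer design can be checked against the cloning strategy of this example: the 5' primer should carry an EcoRI site and anneal to the start of the coding sequence, and the 3' primer should carry a BamHI site. The following is a minimal sketch using only the clearly legible portions of the primers and the first nine codons of SEQ ID NO:1; the variable and function names are hypothetical.

```python
ECORI = "GAATTC"   # EcoRI recognition sequence
BAMHI = "GGATCC"   # BamHI recognition sequence

# Legible segments only; characters that are unclear in the source are omitted.
FIVE_PRIME_HEAD = "CCGAGAATTC"
FIVE_PRIME_TAIL = "TATGACCGGCATCGAATGGA"
THREE_PRIME_SEGMENT = "TAAGGATCCACACTGGCACAGTGTCA"

# First nine codons of SEQ ID NO:1.
ORF_START = "ATGACCGGCATCGAATGGAACCACGAG"


def anneals_to_start(primer_tail: str, template: str) -> bool:
    """True if the primer, from its ATG onward, matches the template start."""
    atg = primer_tail.find("ATG")
    return atg != -1 and template.startswith(primer_tail[atg:])


if __name__ == "__main__":
    print("EcoRI site in 5' primer:", ECORI in FIVE_PRIME_HEAD)
    print("BamHI site in 3' primer:", BAMHI in THREE_PRIME_SEGMENT)
    print("5' primer anneals at ATG:", anneals_to_start(FIVE_PRIME_TAIL, ORF_START))
```

The check is purely textual; it says nothing about melting temperature or secondary structure.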

The protein was expressed in E. coli. The gene
was amplified using PCR with the primers indicated above.

Subsequent to amplification, the PCR product was
cloned into the EcoRI and BamHI sites of pQET1 and
transformed by electroporation into E. coli M15(pREP4).
The resulting transformants were grown up in 3 ml
cultures, and a portion of this culture was induced. A
portion of the uninduced and induced cultures were
assayed using Z-L-Phe-AMC (see below).

The primer sequences set out above may also be
employed to isolate the target gene from the deposited
material by hybridization techniques described above.

Example 2 

Discovery of an amidase from Thermococcus GU5L5 

Production of the expression gene bank. 

Colonies containing pBluescript plasmids with
random inserts from the organism Thermococcus GU5L5 were
obtained according to the method of Hay and Short (Hay,
B. and Short, J., Strategies, 1992, 5, 16). The
resulting colonies were picked with sterile toothpicks
and used to singly inoculate each of the wells of 96-well
microtiter plates. The wells contained 250 µL of LB
media with 100 µg/mL ampicillin, 30 µg/mL methicillin,
and 10% v/v glycerol (LB Amp/Meth, glycerol). The cells
were grown overnight at 37°C without shaking. This
constituted generation of the "Source GeneBank"; each well
of the Source GeneBank thus contained a stock culture of
E. coli cells, each of which contained a pBluescript
plasmid with a unique DNA insert.

Screening for amidase activity.

The plates of the Source GeneBank were used to
multiply inoculate a single plate (the "Condensed Plate")
containing in each well 200 µL of LB Amp/Meth, glycerol.
This step was performed using the High Density
Replicating Tool (HDRT) of the Beckman Biomek with a 1%
bleach, water, isopropanol, air-dry sterilization cycle
in between each inoculation. Each well of the Condensed
Plate thus contained 10 to 12 different pBluescript
clones from each of the source library plates. The
Condensed Plate was grown for 16 h at 37°C and then used
to inoculate two white 96-well Polyfiltronics microtiter
daughter plates containing in each well 250 µL of LB
Amp/Meth (without glycerol). The original condensed
plate was put in storage at -80°C. The two condensed
daughter plates were incubated at 37°C for 18 h.

The '600 µM substrate stock solution' was prepared
as follows: 25 mg of N-morphourea-L-phenylalanyl-7-
amido-4-trifluoromethylcoumarin (Mu-Phe-AFC, Enzyme
Systems Products, Dublin, CA) was dissolved in the
appropriate volume of DMSO to yield a 25.2 mM solution.
Two hundred fifty microliters of DMSO solution was added
to ca. 9 mL of 50 mM, pH 7.5 Hepes buffer containing 0.6
mg/mL of dodecyl maltoside. The volume was taken to 10.5
mL with the above Hepes buffer to yield a cloudy
solution.

[Structure of Mu-Phe-AFC]

Fifty µL of the '600 µM stock solution' was added
to each of the wells of a white condensed plate using the
Biomek to yield a final concentration of substrate of
~100 µM. The fluorescence values were recorded
(excitation = 400 nm, emission = 505 nm) on a plate
reading fluorometer immediately after addition of the
substrate. The plate was incubated at 70°C for 60 min.
and the fluorescence values were recorded again. The
initial and final fluorescence values were subtracted to
determine if an active clone was present by an increase
in fluorescence over the majority of the other wells.
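The comparison of initial and final fluorescence readings can be expressed as a short calculation. The following is a minimal sketch only: the specification does not state a numerical cutoff, so the fold threshold over the plate median, the function name and the example readings are all illustrative assumptions.

```python
from statistics import median


def find_active_wells(initial: dict, final: dict, fold: float = 3.0) -> list:
    """Return wells whose fluorescence increase stands out from the plate.

    initial, final: well name -> fluorescence reading before and after
    the 70 C incubation. A well is flagged when its increase exceeds
    `fold` times the median increase (an assumed, illustrative rule).
    """
    deltas = {well: final[well] - initial[well] for well in initial}
    # Guard against a zero or negative median increase.
    baseline = max(median(deltas.values()), 1.0)
    return sorted(well for well, d in deltas.items() if d > fold * baseline)


if __name__ == "__main__":
    # Hypothetical readings for four wells.
    before = {"A1": 100, "A2": 110, "A3": 105, "A4": 98}
    after = {"A1": 120, "A2": 125, "A3": 900, "A4": 115}
    print(find_active_wells(before, after))
```

Using the median as the baseline reflects the statement that an active well is judged against the majority of the other wells, most of which are expected to be inactive.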

Isolation of the active clone.

In order to isolate the individual clone which
carried the activity, the Source GeneBank plates were
thawed and the individual wells used to singly inoculate
a new plate containing LB Amp/Meth. As above the plate
was incubated at 37°C to grow the cells, and 50 µL of 600
µM substrate stock solution added using the Biomek. Once
the active well from the source plate was identified, the
cells from the source plate were used to inoculate 3 mL
cultures of LB/Amp/Meth, which were grown overnight. The
plasmid DNA was isolated from the cultures and utilized
for sequencing and construction of expression subclones.
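The substrate concentrations quoted in this example follow from ordinary dilution arithmetic (C1 x V1 = C2 x V2). The following is a minimal sketch that reproduces the 600 µM stock and the approximately 100 µM final well concentration from the volumes stated in the text; the function name is hypothetical, and the 250 µL well volume is taken from the daughter plate description.

```python
def dilute(conc: float, volume: float, final_volume: float) -> float:
    """Concentration after bringing `volume` at `conc` up to `final_volume`.

    Volumes must share one unit; the result is in the unit of `conc`.
    """
    if final_volume <= 0:
        raise ValueError("final volume must be positive")
    return conc * volume / final_volume


# 250 uL (0.250 mL) of the 25.2 mM (25200 uM) DMSO solution taken to 10.5 mL.
stock_um = dilute(25200.0, 0.250, 10.5)

# 50 uL of stock added to a well already holding 250 uL of culture.
well_um = dilute(stock_um, 50.0, 250.0 + 50.0)

if __name__ == "__main__":
    print(f"stock: {stock_um:.0f} uM, well: {well_um:.0f} uM")
```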

Example 3

Thermococcus GU5L5 Amidase characterization

Substrate specificity.

Using the following substrates (see below for
definitions of the abbreviations): CBZ-L-ala-AMC, CBZ-L-
arg-AMC, CBZ-L-met-AMC, CBZ-L-phe-AMC, and 7-methyl-
umbelliferyl heptanoate at 100 µM for 1 hour at 70°C in
the assays as described in the clone discovery section,
the relative activity of the amidase was 3:3:1:0.1:<0.1
for the compounds CBZ-L-arg-AMC : CBZ-L-phe-AMC : CBZ-L-
met-AMC : CBZ-L-ala-AMC : 7-methylumbelliferyl
heptanoate. The excitation and emission wavelengths for
the 7-amido-4-methylcoumarins were 380 and 460 nm
respectively, and 326 and 450 nm for the
methylumbelliferone.
The abbreviations stand for the following
compounds:

CBZ-L-ala-AMC = Nα-carbonylbenzyloxy-L-alanine-7-
amido-4-methylcoumarin

CBZ-L-arg-AMC = Nα-carbonylbenzyloxy-L-arginine-7-
amido-4-methylcoumarin

CBZ-D-arg-AMC = Nα-carbonylbenzyloxy-D-arginine-7-
amido-4-methylcoumarin

CBZ-L-met-AMC = Nα-carbonylbenzyloxy-L-methionine-
7-amido-4-methylcoumarin

CBZ-L-phe-AMC = Nα-carbonylbenzyloxy-L-
phenylalanine-7-amido-4-methylcoumarin

Organic solvent sensitivity.

The activity of the amidase in increasing
concentrations of dimethyl sulfoxide (DMSO) was tested as
follows: to each well of a microtiter plate was added 10
µL of 3 mM CBZ-L-phe-AMC in DMSO, 25 µL of cell lysate
containing the amidase activity, and 250 µL of a variable
mixture of DMSO:pH 7.5, 50 mM Hepes buffer. The
reactions were heated for 1 hour at 70°C and the
fluorescence measured. Figure 2 shows the fluorescence
versus concentration of DMSO. The filled and open boxes
represent individual assays.

The activity and enantioselectivity of the amidase
in increasing concentrations of dimethyl formamide (DMF)
was tested as follows: to each well of a microtiter
plate was added 30 µL of 1 mM CBZ-L-arg-AMC or CBZ-D-arg-
AMC in DMF, 30 µL of cell lysate containing the amidase
activity, and 240 µL of a variable mixture of DMF:pH 7.5,
50 mM Hepes buffer. The reactions were incubated at RT
for 1 hour and the fluorescence measured at 1 minute
intervals. Figure 3 shows the relative initial linear
rates (increase in fluorescence per min, i.e.,
'activity') versus concentration of DMF for the more
reactive CBZ-L-arg-AMC.
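An initial linear rate of the kind described above can be estimated as the least-squares slope of fluorescence against time over the first few readings. The following is a minimal sketch only: the number of points treated as the linear phase, the function name and the example readings are assumptions for illustration, not values from the specification.

```python
def initial_rate(times_min, fluorescence, n_points=5):
    """Least-squares slope (fluorescence units per min) of the first readings.

    n_points: how many early readings are taken as the linear phase
    (an assumed choice; inspect the trace before relying on it).
    """
    t = list(times_min)[:n_points]
    f = list(fluorescence)[:n_points]
    n = len(t)
    if n < 2 or len(f) != n:
        raise ValueError("need at least two paired readings")
    t_mean = sum(t) / n
    f_mean = sum(f) / n
    sxx = sum((x - t_mean) ** 2 for x in t)
    sxy = sum((x - t_mean) * (y - f_mean) for x, y in zip(t, f))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical trace read at 1 minute intervals; the last point plateaus.
    minutes = [0, 1, 2, 3, 4, 5]
    readings = [100, 150, 200, 250, 300, 320]
    print(initial_rate(minutes, readings))
```

Restricting the fit to the early readings keeps the estimate from being lowered by substrate depletion later in the hour.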

The initial linear rates ('activity') of the L and
the D CBZ-arg-AMC substrates are shown in Tables 1 and 2
below:

Table 1

Activity of the CBZ-L-arg-AMC:

DMF      Initial Rate, Fl.U./min
0.4%     654
10%      2548
20%      1451
30%      541
40%      345
50%      303
60%      190
75%      81
90%      11

Table 2

Activity of the CBZ-D-arg-AMC:

DMF      Initial Rate, Fl.U./min
0.4%     0.3
10%      10.1
20%      4.6
30%      1.8
40%      0.9
50%      1.2
60%      1.4
75%      0.1
90%      0.1



The above data indicate that the enzyme shows
excellent selectivity for the L, or 'natural' enantiomer
of the derivatized amino acid substrate.
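The selectivity can be made explicit by dividing the initial rate for the L substrate by that for the D substrate at each DMF concentration. The following is a minimal sketch using the 0.4% to 40% DMF rows of Tables 1 and 2; the ratio is only a rough indicator, since it compares rates at a single substrate concentration, and the names used are hypothetical.

```python
# Initial rates in fluorescence units per min, keyed by percent DMF.
L_RATES = {0.4: 654, 10: 2548, 20: 1451, 30: 541, 40: 345}   # Table 1
D_RATES = {0.4: 0.3, 10: 10.1, 20: 4.6, 30: 1.8, 40: 0.9}    # Table 2


def selectivity(l_rates: dict, d_rates: dict) -> dict:
    """Ratio of L to D initial rate at each shared DMF concentration."""
    return {
        pct: l_rates[pct] / d_rates[pct]
        for pct in l_rates
        if pct in d_rates and d_rates[pct] > 0
    }


if __name__ == "__main__":
    for pct, ratio in selectivity(L_RATES, D_RATES).items():
        print(f"{pct}% DMF: L/D = {ratio:.0f}")
```

At every concentration in this range the L substrate is turned over more than two orders of magnitude faster than the D substrate.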



Numerous modifications and variations of the
present invention are possible in light of the above
teachings and, therefore, within the scope of the
appended claims, the invention may be practiced otherwise
than as particularly described.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Recombinant Biocatalysis, Inc.

(ii) TITLE OF INVENTION: Amidases

(iii) NUMBER OF SEQUENCES: 4

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: FISH & RICHARDSON

(B) STREET: 4225 EXECUTIVE SQUARE, STE. 1400

(C) CITY: LA JOLLA

(D) STATE: CA

(E) COUNTRY: USA

(F) ZIP: 92037

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: 3.5 INCH DISKETTE

(B) COMPUTER: IBM PS/2

(C) OPERATING SYSTEM: MS-DOS

(D) SOFTWARE: WORD PERFECT 6.0

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: Unassigned

(B) FILING DATE: Herewith

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 08/664,646

(B) FILING DATE: 17 June 1996

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: LISA A. HAILE, Ph.D.

(B) REGISTRATION NUMBER: 38,347

(C) REFERENCE/DOCKET NUMBER: 09C10/005WO1

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 619-678-5070

(B) TELEFAX: 619-678-5099

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 1869 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ATG ACC GGC ATC GAA TGG AAC CAC GAG ACC TTT TCT AAG TTC GCC TAC  48
Met Thr Gly Ile Glu Trp Asn His Glu Thr Phe Ser Lys Phe Ala Tyr
5 10 15

CTG GGC GAC CCG AGG ATA CGG GGA AAC TTA ATC GCG TAC ACC CTG ACG  96
Leu Gly Asp Pro Arg Ile Arg Gly Asn Leu Ile Ala Tyr Thr Leu Thr
20 25 30

AAG GCC AAC ATG AAG GAC AAC AAG TAC GAG AGC ACG GTT GTT GTT GAA  144
Lys Ala Asn Met Lys Asp Asn Lys Tyr Glu Ser Thr Val Val Val Glu
35 40 45

GAC CTT GAA ACG GGC TCA AGG CGC TTC ATC GAG AAC GCC TCA ATG CCG  192
Asp Leu Glu Thr Gly Ser Arg Arg Phe Ile Glu Asn Ala Ser Met Pro
50 55 60

AGG ATT TCG CCA GAC GGC AGA AAG CTC GCC TTC ACC TGC TTT AAC GAG  240
Arg Ile Ser Pro Asp Gly Arg Lys Leu Ala Phe Thr Cys Phe Asn Glu
65 70 75 80

GAG AAG AAG GAG ACC GAG ATA TGG GTG GCC GAT ATC CAG ACC CTG AGC  288
Glu Lys Lys Glu Thr Glu Ile Trp Val Ala Asp Ile Gln Thr Leu Ser
85 90 95

GCC AAG AAA GTC CTC TCA ACT AAA AAC GTC CGC TCG ATG CAG TGG AAC  336
Ala Lys Lys Val Leu Ser Thr Lys Asn Val Arg Ser Met Gln Trp Asn
100 105 110

GAC GAT TCA AGG AGA CTC TTA GTT GTC GGC TTC AAG AGG AGG GAC GAT  384
Asp Asp Ser Arg Arg Leu Leu Val Val Gly Phe Lys Arg Arg Asp Asp
115 120 125

GAG GAC TTC GTC TTT GAC GAC GAC GTC CCG GTC TGG TTC GAC AAT ATG  432
Glu Asp Phe Val Phe Asp Asp Asp Val Pro Val Trp Phe Asp Asn Met
130 135 140

GGA TTC TTT GAT GGA GAG AAG ACG ACG TTC TGG GTT CTT GAC ACT GAG  480
Gly Phe Phe Asp Gly Glu Lys Thr Thr Phe Trp Val Leu Asp Thr Glu
145 150 155 160

GCC GAG GAG ATA ATC GAG CAG TTC GAG AAG CCG AGG TTT TCG AGT GGC  528
Ala Glu Glu Ile Ile Glu Gln Phe Glu Lys Pro Arg Phe Ser Ser Gly
165 170 175

CTC TGG CAC GGC GAT GCG ATA GTT GTG AAC GTC CCG CAC CGC GAG GGG  576
Leu Trp His Gly Asp Ala Ile Val Val Asn Val Pro His Arg Glu Gly
180 185 190

AGC AAG CCT GCC CTG TTC AAG TTC TAC GAC ATA GTC CTA TGG AAG GAC  624
Ser Lys Pro Ala Leu Phe Lys Phe Tyr Asp Ile Val Leu Trp Lys Asp
195 200 205

GGG GAG GAA GAG AAG CTC TTC GAG AGG GTC TCC TTC GAG GCG GTT GAC  672
Gly Glu Glu Glu Lys Leu Phe Glu Arg Val Ser Phe Glu Ala Val Asp
210 215 220

TCC GAC GGA AAG AGA ATA CTC CTG AGG GGC AAG AAA AAA AAG CGG TTC  720
Ser Asp Gly Lys Arg Ile Leu Leu Arg Gly Lys Lys Lys Lys Arg Phe
225 230 235 240

ATC AGC GAG CAC GAC TGG CTG TAC CTC TGG GAC GGC GAG CTT AAA CCG  768
Ile Ser Glu His Asp Trp Leu Tyr Leu Trp Asp Gly Glu Leu Lys Pro
245 250 255

ATC TAC GAG GGC CCG CTC GAC GTC TGG GAA GCC AAG CTC ACG GAA GGA  816
Ile Tyr Glu Gly Pro Leu Asp Val Trp Glu Ala Lys Leu Thr Glu Gly
260 265 270

AAG GTC TAC TTC CTC ACT CCA GAT GCG GGC AGG GTA AAC CTC TGG CTC  864
Lys Val Tyr Phe Leu Thr Pro Asp Ala Gly Arg Val Asn Leu Trp Leu
275 280 285

TGG GAC GGG AAG GCC GAG CGT GTT GTT ACC GGC GAC CAC TGG ATT TAC  912
Trp Asp Gly Lys Ala Glu Arg Val Val Thr Gly Asp His Trp Ile Tyr
290 295 300

GGG CTT GAC GTC AGC GAT GGC AAA GCA TTG CTC CTC ATC ATG ACC GCC  960
Gly Leu Asp Val Ser Asp Gly Lys Ala Leu Leu Leu Ile Met Thr Ala
305 310 315 320

ACG AGG ATA GGC GAG CTC TAC CTC TAC GAC GGC GAG CTG AAA CAG GTC  1008
Thr Arg Ile Gly Glu Leu Tyr Leu Tyr Asp Gly Glu Leu Lys Gln Val
325 330 335

ACC GAA TAC AAC GGG CCG ATA TTC AGG AAG CTC AAG ACC TTC GAG CCG  1056
Thr Glu Tyr Asn Gly Pro Ile Phe Arg Lys Leu Lys Thr Phe Glu Pro
340 345 350

AGG CAC TTC CGC TTC AAG AGC AAA GAC CTC GAG ATA GAC GGC TGG TAC  1104
Arg His Phe Arg Phe Lys Ser Lys Asp Leu Glu Ile Asp Gly Trp Tyr
355 360 365

CTC AGG CCG GAG GTT AAA GAG GAG AAG GCC CCG GTG ATA GTC TTC GTC  1152
Leu Arg Pro Glu Val Lys Glu Glu Lys Ala Pro Val Ile Val Phe Val
370 375 380

CAC GGC GGG CCG AAG GGC ATG TAC GGA CAC CGC TTC GTC TAC GAG ATG  1200
His Gly Gly Pro Lys Gly Met Tyr Gly His Arg Phe Val Tyr Glu Met
385 390 395 400

CAG CTG ATG GCG AGC AAG GGC TAC TAC TGC TGC TTC GTG AAC CCG CGC  1248
Gln Leu Met Ala Ser Lys Gly Tyr Tyr Val Val Phe Val Asn Pro Arg
405 410 415

GGC AGC GAC GGC TAT AGC GAA GAC TTC GCG CTC CGC GTC CTG GAG AGG  1296
Gly Ser Asp Gly Tyr Ser Glu Asp Phe Ala Leu Arg Val Leu Glu Arg
420 425 430

ACT GGC TTG GAG GAC TTT GAG GAC ATA ATG AAC GGC ATC GAG GAG TTC  1344
Thr Gly Leu Glu Asp Phe Glu Asp Ile Met Asn Gly Ile Glu Glu Phe
435 440 445

TTC AAG CTC GAA CCG CAG GCC GAC AGG GAG CGC GTT GGA ATA ACG GGC  1392
Phe Lys Leu Glu Pro Gln Ala Asp Arg Glu Arg Val Gly Ile Thr Gly
450 455 460

ATA AGC TAC GGC GGC TTC ATG ACC AAC TGG GCC TTG AC? CAG AGC GAC  1440
Ile Ser Tyr Gly Gly Phe Met Thr Asn Trp Ala Leu Thr Gln Ser Asp
465 470 475 480

CTC TTC AAG GCA GGA ATA AGC GAG AAC GGC ATA AGC TAC TGG CTC ACC  1488
Leu Phe Lys Ala Gly Ile Ser Glu Asn Gly Ile Ser Tyr Trp Leu Thr
485 490 495

AGC TAC GCC TTC TCG GAC ATA GGG CTC TGG TAC GAC GTC GAG GTC ATC  1536
Ser Tyr Ala Phe Ser Asp Ile Gly Leu Trp Tyr Asp Val Glu Val Ile
500 505 510

GGG CCA AAT CCG TTA GAG AAC GAG AAC TTC AGG AAG CTC AGC CCG CTG  1584
Gly Pro Asn Pro Leu Glu Asn Glu Asn Phe Arg Lys Leu Ser Pro Leu
515 520 525

TTC TAC GCT CAG AAC GTG AAG GCG CCG ATA CTC CTA ATC CAC TCG CTT  1632
Phe Tyr Ala Gln Asn Val Lys Ala Pro Ile Leu Leu Ile His Ser Leu
530 535 540

GAG GAC TAC CGC TGT CCG CTC GAC CAG AGC CTT ATG TTC TAC AAC GTG  1680
Glu Asp Tyr Arg Cys Pro Leu Asp Gln Ser Leu Met Phe Tyr Asn Val
545 550 555 560

CTC AAG GAC ATG GGC AAG GAA GCC TAC ATA GCG ATA TTC AAG CGC GGC  1728
Leu Lys Asp Met Gly Lys Glu Ala Tyr Ile Ala Ile Phe Lys Arg Gly
565 570 575

GCC CAC GGC CAC AGC GTC CGC GGA AGC CCG AGG CAC AGG CCG AAG CGC  1776
Ala His Gly His Ser Val Arg Gly Ser Pro Arg His Arg Pro Lys Arg
580 585 590

TAC AGG CTC TTC ATA GAG TTC TTC GAG CGC AAG CTC AAG AAG TAC GAG  1824
Tyr Arg Leu Phe Ile Glu Phe Phe Glu Arg Lys Leu Lys Lys Tyr Glu
595 600 605

GAG GGC TTT GAG GTA GAG AAG ATA CTC AAG GGG AAT GGG AAC TGA  1869
Glu Gly Phe Glu Val Glu Lys Ile Leu Lys Gly Asn Gly Asn
610 615 620
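The correspondence between the nucleotide sequence and the amino acid sequence shown beneath it can be checked by translating codons with the standard genetic code. The following is a minimal sketch that translates the first sixteen codons of SEQ ID NO:1 and confirms that 1869 nucleotides correspond to 622 residues plus a stop codon; the function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First sixteen codons of SEQ ID NO:1 (nucleotides 1 to 48).
FIRST_ROW = "ATG ACC GGC ATC GAA TGG AAC CAC GAG ACC TTT TCT AAG TTC GCC TAC"

if __name__ == "__main__":
    print(translate(FIRST_ROW))   # MTGIEWNHETFSKFAY
    print(1869 // 3 - 1)          # residues encoded before the stop codon
```

The one-letter result corresponds to Met Thr Gly Ile Glu Trp Asn His Glu Thr Phe Ser Lys Phe Ala Tyr, the first row of the listing.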



(2) INTORMATION FOR SEQ ID NO: 2: 

(1) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 622 AMINO ACIDS 

(B) TYPE: AMINO ACID 

(C) STRANDEDNESS : 

(D) TOPOLOGY: LINEAR 

{ill MOLECULE TYPE: PROTEIN 
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:xi) SECUENCi DESCRIPTION: SE"^ ID NC : 2 ; 

Met Thr Gly Ile Glu Trp Asn His Glu Thr Phe Ser Lys Phe Ala Tyr
5 10 15

Leu Gly Asp Pro Arg Ile Arg Gly Asn Leu Ile Ala Tyr Thr Leu Thr
20 25 30

Lys Ala Asn Met Lys Asp Asn Lys Tyr Glu Ser Thr Val Val Val Glu
35 40 45

Asp Leu Glu Thr Gly Ser Arg Arg Phe Ile Glu Asn Ala Ser Met Pro
50 55 60

Arg Ile Ser Pro Asp Gly Arg Lys Leu Ala Phe Thr Cys Phe Asn Glu
65 70 75 80

Glu Lys Lys Glu Thr Glu Ile Trp Val Ala Asp Ile Gln Thr Leu Ser
85 90 95

Ala Lys Lys Val Leu Ser Thr Lys Asn Val Arg Ser Met Gln Trp Asn
100 105 110

Asp Asp Ser Arg Arg Leu Leu Val Val Gly Phe Lys Arg Arg Asp Asp
115 120 125

Glu Asp Phe Val Phe Asp Asp Asp Val Pro Val Trp Phe Asp Asn Met
130 135 140

Gly Phe Phe Asp Gly Glu Lys Thr Thr Phe Trp Val Leu Asp Thr Glu
145 150 155 160

Ala Glu Glu Ile Ile Glu Gln Phe Glu Lys Pro Arg Phe Ser Ser Gly
165 170 175

Leu Trp His Gly Asp Ala Ile Val Val Asn Val Pro His Arg Glu Gly
180 185 190

Ser Lys Pro Ala Leu Phe Lys Phe Tyr Asp Ile Val Leu Trp Lys Asp
195 200 205

Gly Glu Glu Glu Lys Leu Phe Glu Arg Val Ser Phe Glu Ala Val Asp
210 215 220

Ser Asp Gly Lys Arg Ile Leu Leu Arg Gly Lys Lys Lys Lys Arg Phe
225 230 235 240

Ile Ser Glu His Asp Trp Leu Tyr Leu Trp Asp Gly Glu Leu Lys Pro
245 250 255

Ile Tyr Glu Gly Pro Leu Asp Val Trp Glu Ala Lys Leu Thr Glu Gly
260 265 270

Lys Val Tyr Phe Leu Thr Pro Asp Ala Gly Arg Val Asn Leu Trp Leu
275 280 285

Trp Asp Gly Lys Ala Glu Arg Val Val Thr Gly Asp His Trp Ile Tyr
290 295 300

Gly Leu Asp Val Ser Asp Gly Lys Ala Leu Leu Leu Ile Met Thr Ala
305 310 315 320

Thr Arg Ile Gly Glu Leu Tyr Leu Tyr Asp Gly Glu Leu Lys Gln Val
325 330 335

Thr Glu Tyr Asn Gly Pro Ile Phe Arg Lys Leu Lys Thr Phe Glu Pro
340 345 350

Arg His Phe Arg Phe Lys Ser Lys Asp Leu Glu Ile Asp Gly Trp Tyr
355 360 365

Leu Arg Pro Glu Val Lys Glu Glu Lys Ala Pro Val Ile Val Phe Val
370 375 380

His Gly Gly Pro Lys Gly Met Tyr Gly His Arg Phe Val Tyr Glu Met
385 390 395 400

Gln Leu Met Ala Ser Lys Gly Tyr Tyr Val Val Phe Val Asn Pro Arg
405 410 415

Gly Ser Asp Gly Tyr Ser Glu Asp Phe Ala Leu Arg Val Leu Glu Arg
420 425 430

Thr Gly Leu Glu Asp Phe Glu Asp Ile Met Asn Gly Ile Glu Glu Phe
435 440 445

Phe Lys Leu Glu Pro Gln Ala Asp Arg Glu Arg Val Gly Ile Thr Gly
450 455 460

Ile Ser Tyr Gly Gly Phe Met Thr Asn Trp Ala Leu Thr Gln Ser Asp
465 470 475 480

Leu Phe Lys Ala Gly Ile Ser Glu Asn Gly Ile Ser Tyr Trp Leu Thr
485 490 495

Ser Tyr Ala Phe Ser Asp Ile Gly Leu Trp Tyr Asp Val Glu Val Ile
500 505 510

Gly Pro Asn Pro Leu Glu Asn Glu Asn Phe Arg Lys Leu Ser Pro Leu
515 520 525

Phe Tyr Ala Gln Asn Val Lys Ala Pro Ile Leu Leu Ile His Ser Leu
530 535 540

Glu Asp Tyr Arg Cys Pro Leu Asp Gln Ser Leu Met Phe Tyr Asn Val
545 550 555 560

Leu Lys Asp Met Gly Lys Glu Ala Tyr Ile Ala Ile Phe Lys Arg Gly
565 570 575

Ala His Gly His Ser Val Arg Gly Ser Pro Arg His Arg Pro Lys Arg
580 585 590

Tyr Arg Leu Phe Ile Glu Phe Phe Glu Arg Lys Leu Lys Lys Tyr Glu
595 600 605

Glu Gly Phe Glu Val Glu Lys Ile Leu Lys Gly Asn Gly Asn
610 615 620
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 53 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: Oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCGAGAATTC ATTAAAGAGG AGAAATTAAC TATGACCGGC ATCGAATGGA 50



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 33 NUCLEOTIDES

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: Oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AATAAGGATC CACACTGGCA CAGTGTCAAG ACA 33
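
As a rough illustration of how the two oligonucleotides relate to the coding sequence, the Python sketch below scans SEQ ID NO: 3 and SEQ ID NO: 4 for two common restriction enzyme recognition sites and checks that the tail of SEQ ID NO: 3 matches the start of SEQ ID NO: 1. The choice of EcoRI (GAATTC) and BamHI (GGATCC) as the sites to look for is an assumption made for this example; this part of the document does not name the enzymes. The function name `find_sites` is likewise hypothetical.

```python
# Oligonucleotides from the sequence listing, written without spacing.
SEQ_ID_NO_3 = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGACCGGCATCGAATGGA"
SEQ_ID_NO_4 = "AATAAGGATCCACACTGGCACAGTGTCAAGACA"

# First seven codons of SEQ ID NO: 1 (ATG ACC GGC ATC GAA TGG AAC).
CODING_START = "ATGACCGGCATCGAATGGAAC"

# Assumed sites of interest: the well-known EcoRI and BamHI recognition sequences.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def find_sites(seq: str) -> dict[str, list[int]]:
    """Return the 0-based start positions of each recognition site in seq."""
    return {
        name: [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]
        for name, site in SITES.items()
    }


print(find_sites(SEQ_ID_NO_3))  # {'EcoRI': [4], 'BamHI': []}
print(find_sites(SEQ_ID_NO_4))  # {'EcoRI': [], 'BamHI': [5]}

# The portion of SEQ ID NO: 3 from its first ATG onward should be a prefix
# of the coding sequence if the oligonucleotide anneals at the start codon.
tail = SEQ_ID_NO_3[SEQ_ID_NO_3.index("ATG"):]
print(CODING_START.startswith(tail))  # True
```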
What Is Claimed Is:

1. An isolated polynucleotide which encodes the amino
acid sequence set forth in SEQ ID NO: 2.

2. An isolated polynucleotide selected from the group 
consisting of: 

a) SEQ ID NO:1;

b) SEQ ID NO:1, wherein T can also be U;

c) nucleic acid sequences complementary to a) and b);
and

d) fragments of a), b), or c) that are at least 15
bases in length and that will hybridize to DNA
which encodes the amino acid sequence of SEQ ID
NO: 2.

3. The polynucleotide of claim 1, wherein the polynucleotide
is isolated from a prokaryote.

4. An expression vector including the polynucleotide 
of claim 1. 

5. The vector of claim 4, wherein the vector is a 
plasmid. 

6. The vector of claim 4, wherein the vector is
virus-derived.

7. A host cell transformed with the vector of claim 4.
8. The host cell of claim wherein the cell is
prokaryotic.

9. The polynucleotide of claim 1 which encodes the
enzyme comprising amino acid 1 to 622 of SEQ ID
NO:2.

10. The polynucleotide of claim 1 comprising the
sequence as set forth in SEQ ID NO:1 from
nucleotide 1 to nucleotide 1866.

11. A substantially pure polypeptide selected from the 
group consisting of: 

a) an enzyme comprising an amino acid sequence
which is at least 70% identical to the amino
acid sequence set forth in SEQ ID NO:2;

b) an enzyme which comprises at least 30 amino
acid residues to the enzyme of a); and

c) the amino acid sequence as set forth in SEQ
ID NO:2.

12. Antibodies that bind to the polypeptide of claim 
11. 

13. The antibodies of claim 12, wherein the antibodies 
are polyclonal. 



14. The antibodies of claim 12, wherein the antibodies
are monoclonal.
15. A method for producing an enzyme comprising
growing a host cell of claim 7 under conditions
which allow the expression of the nucleic acid and
isolating the enzyme encoded by the nucleic acid.

16. A process for producing a recombinant cell
comprising transforming or transfecting the cell
with the vector of claim 4 such that the cell
expresses a polypeptide encoded by the DNA
contained in the vector.

17. A process for removal of arginine, phenylalanine or
methionine from the N-terminal end of peptides in
peptide or peptidomimetic synthesis, comprising:
administering an amount of the enzyme of claim 10
effective for removal of arginine, phenylalanine or
methionine from the N-terminal end of peptides in
peptide or peptidomimetic synthesis.
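
Claim 11(a) turns on a percent identity threshold. The Python sketch below shows one simple way such a figure could be computed, under the simplifying assumption that the two sequences are already aligned and of equal length with no gaps. This is an illustration only and not necessarily the comparison method intended by the applicants. The function name `percent_identity` and the variant sequence are hypothetical; the reference is the first ten residues of SEQ ID NO: 2 in one-letter code.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two pre-aligned,
    equal-length, ungapped sequences (a simplifying assumption)."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length for this sketch")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# First ten residues of SEQ ID NO: 2: Met Thr Gly Ile Glu Trp Asn His Glu Thr.
reference = "MTGIEWNHET"
# Hypothetical variant with two substitutions (Ile->Leu, Glu->Asp).
variant = "MTGLEWNHDT"

identity = percent_identity(reference, variant)
print(identity)          # 80.0
print(identity >= 70.0)  # True: this toy variant would meet a 70% threshold
```

In practice, sequences of different lengths would first need an alignment step, which this sketch deliberately leaves out.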
Figure 1

Thermococcus GU5L5 Amidase

1 ATG ACC GGC ATC GAA TGG AAC CAC GAG ACC TTT TCT AAG TTC GCC TAC
CTG GGC GAC CCG 60

1 Met Thr Gly Ile Glu Trp Asn His Glu Thr Phe Ser Lys Phe Ala Tyr
Leu Gly Asp Pro 20

61 AGG ATA CGG GGA AAC TTA ATC GCG TAC ACC CTG ACG AAG GCC AAC ATG
AAG GAC AAC AAG 120

21 Arg Ile Arg Gly Asn Leu Ile Ala Tyr Thr Leu Thr Lys Ala Asn Met
Lys Asp Asn Lys 40

121 TAC GAG AGC ACG GTT GTT GTT GAA GAC CTT GAA ACG GGC TCA AGG CGC
TTC ATC GAG AAC 180

41 Tyr Glu Ser Thr Val Val Val Glu Asp Leu Glu Thr Gly Ser Arg Arg
Phe Ile Glu Asn 60

181 GCC TCA ATG CCG AGG ATT TCG CCA GAC GGC AGA AAG CTC GCC TTC ACC
TGC TTT AAC GAG 240

61 Ala Ser Met Pro Arg Ile Ser Pro Asp Gly Arg Lys Leu Ala Phe Thr
Cys Phe Asn Glu 80

241 GAG AAG AAG GAG ACC GAG ATA TGG GTG GCC GAT ATC CAG ACC CTG AGC
GCC AAG AAA GTC 300

81 Glu Lys Lys Glu Thr Glu Ile Trp Val Ala Asp Ile Gln Thr Leu Ser
Ala Lys Lys Val 100

301 CTC TCA ACT AAA AAC GTC CGC TCG ATG CAG TGG AAC GAC GAT TCA AGG
AGA CTC TTA GTT 360

101 Leu Ser Thr Lys Asn Val Arg Ser Met Gln Trp Asn Asp Asp Ser Arg
Arg Leu Leu Val 120

361 GTC GGC TTC AAG AGG AGG GAC GAT GAG GAC TTC GTC TTT GAC GAC GAC
GTC CCG GTC TGG 420

121 Val Gly Phe Lys Arg Arg Asp Asp Glu Asp Phe Val Phe Asp Asp Asp
Val Pro Val Trp 140

421 TTC GAC AAT ATG GGA TTC TTT GAT GGA GAG AAG ACG ACG TTC TGG GTT
CTT GAC ACT GAG 480

141 Phe Asp Asn Met Gly Phe Phe Asp Gly Glu Lys Thr Thr Phe Trp Val
Leu Asp Thr Glu 160

481 GCC GAG GAG ATA ATC GAG CAG TTC GAG AAG CCG AGG TTT TCG AGT GGC
CTC TGG CAC GGC 540

161 Ala Glu Glu Ile Ile Glu Gln Phe Glu Lys Pro Arg Phe Ser Ser Gly
Leu Trp His Gly 180

541 GAT GCG ATA GTT GTG AAC GTC CCG CAC CGC GAG GGG AGC AAG CCT GCC
CTG TTC AAG TTC 600

181 Asp Ala Ile Val Val Asn Val Pro His Arg Glu Gly Ser Lys Pro Ala
Leu Phe Lys Phe 200

601 TAC GAC ATA GTC CTA TGG AAG GAC GGG GAG GAA GAG AAG CTC TTC GAG
AGG GTC TCC TTC 660

201 Tyr Asp Ile Val Leu Trp Lys Asp Gly Glu Glu Glu Lys Leu Phe Glu
Arg Val Ser Phe 220

661 GAG GCG GTT GAC TCC GAC GGA AAG AGA ATA CTC CTG AGG GGC AAG AAA
AAA AAG CGG TTC 720

221 Glu Ala Val Asp Ser Asp Gly Lys Arg Ile Leu Leu Arg Gly Lys Lys
Lys Lys Arg Phe 240

721 ATC AGC GAG CAC GAC TGG CTG TAC CTC TGG GAC GGC GAG CTT AAA CCG
ATC TAC GAG GGC 780

241 Ile Ser Glu His Asp Trp Leu Tyr Leu Trp Asp Gly Glu Leu Lys Pro
Ile Tyr Glu Gly 260

781 CCG CTC GAC GTC TGG GAA GCC AAG CTC ACG GAA GGA AAG GTC TAC TTC
CTC ACT CCA GAT 840

261 Pro Leu Asp Val Trp Glu Ala Lys Leu Thr Glu Gly Lys Val Tyr Phe
Leu Thr Pro Asp 280

841 GCG GGC AGG GTA AAC CTC TGG CTC TGG GAC GGG AAG GCC GAG CGT GTT
GTT ACC GGC GAC 900

281 Ala Gly Arg Val Asn Leu Trp Leu Trp Asp Gly Lys Ala Glu Arg Val
Val Thr Gly Asp 300

901 CAC TGG ATT TAC GGG CTT GAC GTC AGC GAT GGC AAA GCA TTG CTC CTC
ATC ATG ACC GCC 960

301 His Trp Ile Tyr Gly Leu Asp Val Ser Asp Gly Lys Ala Leu Leu Leu
Ile Met Thr Ala 320

961 ACG AGG ATA GGC GAG CTC TAC CTC TAC GAC GGC GAG CTG AAA CAG GTC
ACC GAA TAC AAC 1020

321 Thr Arg Ile Gly Glu Leu Tyr Leu Tyr Asp Gly Glu Leu Lys Gln Val
Thr Glu Tyr Asn 340

1021 GGG CCG ATA TTC AGG AAG CTC AAG ACC TTC GAG CCG AGG CAC TTC CGC
TTC AAG AGC AAA 1080

341 Gly Pro Ile Phe Arg Lys Leu Lys Thr Phe Glu Pro Arg His Phe Arg
Phe Lys Ser Lys 360

1081 GAC CTC GAG ATA GAC GGC TGG TAC CTC AGG CCG GAG GTT AAA GAG GAG
AAG GCC CCG GTG 1140

361 Asp Leu Glu Ile Asp Gly Trp Tyr Leu Arg Pro Glu Val Lys Glu Glu
Lys Ala Pro Val 380

1141 ATA GTC TTC GTC CAC GGC GGG CCG AAG GGC ATG TAC GGA CAC CGC TTC
GTC TAC GAG ATG 1200

381 Ile Val Phe Val His Gly Gly Pro Lys Gly Met Tyr Gly His Arg Phe
Val Tyr Glu Met 400

1201 CAG CTG ATG GCG AGC AAG GGC TAC TAC GTC GTC TTC GTG AAC CCG CGC
GGC AGC GAC GGC 1260

401 Gln Leu Met Ala Ser Lys Gly Tyr Tyr Val Val Phe Val Asn Pro Arg
Gly Ser Asp Gly 420

1261 TAT AGC GAA GAC TTC GCG CTC CGC GTC CTG GAG AGG ACT GGC TTG GAG
GAC TTT GAG GAC 1320

421 Tyr Ser Glu Asp Phe Ala Leu Arg Val Leu Glu Arg Thr Gly Leu Glu
Asp Phe Glu Asp 440

1321 ATA ATG AAC GGC ATC GAG GAG TTC TTC AAG CTC GAA CCG CAG GCC GAC
AGG GAG CGC GTT 1380

441 Ile Met Asn Gly Ile Glu Glu Phe Phe Lys Leu Glu Pro Gln Ala Asp
Arg Glu Arg Val 460

1381 GGA ATA ACG GGC ATA AGC TAC GGC GGC TTC ATG ACC AAC TGG GCC TTG
ACT CAG AGC GAC 1440

461 Gly Ile Thr Gly Ile Ser Tyr Gly Gly Phe Met Thr Asn Trp Ala Leu
Thr Gln Ser Asp 480

1441 CTC TTC AAG GCA GGA ATA AGC GAG AAC GGC ATA AGC TAC TGG CTC ACC
AGC TAC GCC TTC 1500

481 Leu Phe Lys Ala Gly Ile Ser Glu Asn Gly Ile Ser Tyr Trp Leu Thr
Ser Tyr Ala Phe 500

1501 TCG GAC ATA GGG CTC TGG TAC GAC GTC GAG GTC ATC GGG CCA AAT CCG
TTA GAG AAC GAG 1560

501 Ser Asp Ile Gly Leu Trp Tyr Asp Val Glu Val Ile Gly Pro Asn Pro
Leu Glu Asn Glu 520

1561 AAC TTC AGG AAG CTC AGC CCG CTG TTC TAC GCT CAG AAC GTG AAG GCG
CCG ATA CTC CTA 1620

521 Asn Phe Arg Lys Leu Ser Pro Leu Phe Tyr Ala Gln Asn Val Lys Ala
Pro Ile Leu Leu 540

1621 ATC CAC TCG CTT GAG GAC TAC CGC TGT CCG CTC GAC CAG AGC CTT ATG
TTC TAC AAC GTG 1680

541 Ile His Ser Leu Glu Asp Tyr Arg Cys Pro Leu Asp Gln Ser Leu Met
Phe Tyr Asn Val 560

1681 CTC AAG GAC ATG GGC AAG GAA GCC TAC ATA GCG ATA TTC AAG CGC GGC
GCC CAC GGC CAC 1740

561 Leu Lys Asp Met Gly Lys Glu Ala Tyr Ile Ala Ile Phe Lys Arg Gly
Ala His Gly His 580

1741 AGC GTC CGC GGA AGC CCG AGG CAC AGG CCG AAG CGC TAC AGG CTC TTC
ATA GAG TTC TTC 1800

581 Ser Val Arg Gly Ser Pro Arg His Arg Pro Lys Arg Tyr Arg Leu Phe
Ile Glu Phe Phe 600

1801 GAG CGC AAG CTC AAG AAG TAC GAG GAG GGC TTT GAG GTA GAG AAG ATA
CTC AAG GGG AAT 1860

601 Glu Arg Lys Leu Lys Lys Tyr Glu Glu Gly Phe Glu Val Glu Lys Ile
Leu Lys Gly Asn 620

1861 GGG AAC TGA 1869
621 Gly Asn End 623

Figure 2: Activity of GU5L5 Amidase with CBZ-Phe-AMC vs DMSO (x-axis: % DMSO)
Figure 3



INTERNATIONAL SEARCH REPORT

International application No.
PCT/US97/09319

A. CLASSIFICATION OF SUBJECT MATTER

IPC(6) : C12N 9/80, 15/00, 1/20; C12P 21/06; C07H 21/04; C07K 16/00
US CL : Please See Extra Sheet.
According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

U.S. : 435/228, 69.1, 252.3, 320.1, 68.1; 536/23.2, 23.7; 530/387.1, 388.1

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched
MPsrch pp, protein database search, geneseq25.

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)
Please See Extra Sheet.



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category* / Citation of document, with indication, where appropriate, of the relevant passages / Relevant to claim No.

US 5,451,522 A (QUEENER et al.) 19 September 1995, see entire document. / 1-17

[ ] Further documents are listed in the continuation of Box C. [ ] See patent family annex.




Date of the actual completion of the international search 
n AUGUST 1997 


Date of mailing of the international search report

03 SEP 1997


Name and mailing address of the ISA/US
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230

Authorized officer

Telephone No. (703) 308-0196

Form PCT/ISA/210 (second sheet)(July 1992)*



